Bridging the Gap between Naive Bayes and Maximum Entropy Text Classification
Abstract. The naive Bayes and maximum entropy approaches to text classification are typically discussed as completely unrelated techniques. In this paper, however, we show that both approaches are simply two different ways of doing parameter estimation for a common log-linear model of class posteriors. In particular, we show how to map the solution given by maximum entropy into an optimal solution for naive Bayes according to the conditional maximum likelihood criterion.
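The shared log-linear form mentioned in the abstract can be illustrated with a small sketch. It assumes the multinomial event model for naive Bayes, which the abstract does not specify, and all names and toy numbers below (`nb_posterior`, `loglinear_posterior`, the two classes, the three-word vocabulary) are hypothetical. Under that assumption, the naive Bayes posterior equals a softmax over linear scores whose bias is the log prior and whose weights are the log word probabilities:

```python
import math

# Toy parameters (illustrative assumptions, not taken from the paper).
priors = {"sports": 0.6, "politics": 0.4}
word_probs = {
    "sports": {"goal": 0.5, "vote": 0.1, "match": 0.4},
    "politics": {"goal": 0.1, "vote": 0.6, "match": 0.3},
}


def nb_posterior(counts, priors, word_probs):
    """Class posterior from Bayes' rule with multinomial naive Bayes likelihoods."""
    joint = {
        c: priors[c] * math.prod(word_probs[c][w] ** n for w, n in counts.items())
        for c in priors
    }
    z = sum(joint.values())
    return {c: v / z for c, v in joint.items()}


def loglinear_posterior(counts, bias, weights):
    """Class posterior from a log-linear model: softmax of bias + weights . counts."""
    scores = {
        c: bias[c] + sum(weights[c][w] * n for w, n in counts.items())
        for c in bias
    }
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


# Map the naive Bayes parameters to log-linear parameters.
bias = {c: math.log(p) for c, p in priors.items()}
weights = {c: {w: math.log(p) for w, p in ws.items()} for c, ws in word_probs.items()}

doc = {"goal": 2, "vote": 1}  # word counts of a hypothetical document
direct = nb_posterior(doc, priors, word_probs)
linear = loglinear_posterior(doc, bias, weights)
```

Here `direct` and `linear` agree up to floating-point error. In this view, naive Bayes sets `bias` and `weights` from class-conditional frequency estimates, while a maximum entropy classifier would fit parameters of the same functional form by maximizing conditional likelihood.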
Similar resources
A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...
A Survey Paper On Naive Bayes Classifier For Multi-Feature Based Text Mining
Text mining is variance of a field called data mining. To make unstructured data workable by the computer Text mining is used which is also referred as “Text Analytics”. Text categorization, also called as topic spotting is the task of automatically classifies a set of documents into groups from a predefined set. Text classification is an essential application and research topic because of incr...
Using Maximum Entropy for Text Classification
This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that without external knowledge, one should prefer distributions that are...
Classifying Linux Shell Commands using Naive Bayes Sequence Model
Using Linux shell commands is a challenging task for most of the people new to Linux. This paper presents the idea of conversion of natural language to equivalent Linux shell command. To achieve the conversion we make use of a Naive Bayes text classifier. However there could be a case of a series of flags and combination of commands. This is handled by a sequence of Naive Bayes text classifier....
Classification of Text Documents Based on Minimum System Entropy
In this paper, we describe a new approach to classification of text documents based on the minimization of system entropy, i.e., the overall uncertainty associated with the joint distribution of words and labels in the collection. The classification algorithm assigns a class label to a new document in such a way that its insertion into the system results in the maximum decrease (or least increa...
Publication date: 2007